Marmara Turkish Coreference Corpus and Coreference Resolution Baseline

Authors

  • Peter Schüller
  • Kübra Cingilli
  • Ferit Tunçer
  • Baris Gün Sürmeli
  • Aysegül Pekel
  • Ayse Hande Karatay
  • Hacer Ezgi Karakas
Abstract

We describe the Marmara Turkish Coreference Corpus, which is an annotation of the whole METU-Sabanci Turkish Treebank with mentions and coreference chains. Collecting nine or more independent annotations for each document allowed for fully automatic adjudication. We provide a baseline system for Turkish mention detection and coreference resolution and evaluate it on the corpus.

Similar resources

Corpus based coreference resolution for Farsi text

"Coreference resolution", or "finding all expressions that refer to the same entity" in a text, is one of the important requirements in natural language processing. Two words are coreferent when both refer to a single entity in the text or the real world. So the main task of coreference resolution systems is to identify terms that refer to a unique entity. A coreference resolution tool could be...

Coreference resolution with deep learning in the Persian Language

Coreference resolution is an advanced issue in natural language processing. Nowadays, due to the extension of social networks, TV channels, news agencies, the Internet, etc. in human life, reading all the contents, analyzing them, and finding a relation between them require time and cost. In the present era, text analysis is performed using various natural language processing techniques, one ...

Coreference resolution with syntactico-semantic rules and corpus statistics

A new hybrid approach to the coreference resolution problem is presented. The CORUDIS system (COreference RUles with DIsambiguation Statistics) combines syntactico-semantic rules with statistics derived from an annotated corpus. First, the rules and corpus annotations are described and exemplified. Then, the coreference resolution algorithm and the involved statistics are explained. Finally, th...

Large Corpus-based Semantic Feature Extraction for Pronoun Coreference

Semantic information is a very important factor in coreference resolution. The combination of large corpora and ‘deep’ analysis procedures has made it possible to acquire a range of semantic information and apply it to this task. In this paper, we generate two statistically-based semantic features from a large corpus and measure their influence on pronoun coreference. One is contextual compatib...

Can Projected Chains in Parallel Corpora Help Coreference Resolution?

The majority of current coreference resolution systems rely on annotated corpora to train classifiers for this task. However, this is possible only for languages for which annotated corpora are available. This paper presents a system that automatically extracts coreference chains from texts in Portuguese without the need for Portuguese corpora manually annotated with coreferential information. ...

Journal:
  • CoRR

Volume abs/1706.01863

Publication date: 2017